Pattern Matching in DCA Coded Text

Authors

  • Jan Lahoda
  • Borivoj Melichar
  • Jan Zdárek
Abstract

A new algorithm for finding all occurrences of a regular expression pattern in a text is presented. It works directly on text compressed by the text compression method using antidictionaries, without decompressing it. The proposed algorithm runs in O(2^m · ||AD|| + n_C + r) worst-case time, where m is the length of the pattern, AD is the antidictionary, n_C is the length of the coded text and r is the number of matches found.
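For readers unfamiliar with compression using antidictionaries, the following is a minimal sketch of the coding scheme itself, not of the matching algorithm from the abstract. It assumes a binary alphabet and an antidictionary given as a set of words that never occur in the text. Whenever a suffix of the text read so far, extended by one bit, would form a forbidden word, the other bit is forced and can be omitted from the output. The function names (`forced_bit`, `encode`, `decode`) and the hand-picked antidictionary are illustrative assumptions, not taken from the paper.

```python
def forced_bit(prefix, antidictionary):
    """Return the bit forced after `prefix` by the antidictionary, or None."""
    max_len = max(len(word) for word in antidictionary)
    # Try every suffix of the prefix that could start a forbidden word.
    for k in range(min(len(prefix), max_len - 1) + 1):
        suffix = prefix[len(prefix) - k:]
        for bit in "01":
            if suffix + bit in antidictionary:
                # `bit` would create a forbidden word, so the other bit follows.
                return "1" if bit == "0" else "0"
    return None


def encode(text, antidictionary):
    """Drop every bit that the antidictionary makes predictable."""
    out = []
    for i, bit in enumerate(text):
        if forced_bit(text[:i], antidictionary) is None:
            out.append(bit)  # not predictable: must be stored
    return "".join(out)


def decode(coded, antidictionary, length):
    """Rebuild `length` bits from the coded text and the antidictionary."""
    text = ""
    stored = iter(coded)
    while len(text) < length:
        bit = forced_bit(text, antidictionary)
        text += bit if bit is not None else next(stored)
    return text


# Hypothetical example: none of these words occurs in `text`.
AD = {"00", "0110", "1011", "1111"}
text = "01110101"
coded = encode(text, AD)
print(coded, decode(coded, AD, len(text)) == text)
```

This sketch rescans suffixes at every position and is quadratic at best. Practical implementations typically drive both directions with an automaton built from the antidictionary, and the decoder needs the original text length in addition to the coded bits.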

Similar resources

Pattern Matching Machine for Text Compressed Using Finite State Model

The classical pattern matching problem is to find all occurrences of patterns in a text. In many practical cases, since the text is very large and stored in secondary storage, the time for pattern matching is dominated by the transmission of the text data. Therefore text compression can speed up pattern matching. In this framework it is required to develop an efficient pattern m...

Phrase-Based Pattern Matching in Compressed Text

Byte codes are a practical alternative to the traditional bit-oriented compression approaches when large alphabets are being used, and trade away a small amount of compression effectiveness for a relatively large gain in decoding efficiency. Byte codes also have the advantage of being searchable using standard string matching techniques. Here we describe methods for searching in byte-coded comp...

Efficient String Matching with k Mismatches

Given a text of length n, a pattern of length m and an integer k, we present an algorithm for finding all occurrences of the pattern in the text, each with at most k mismatches. The algorithm runs in O(k(m log m + n)) time. 1. INTRODUCTION The problem of string matching with k mismatches is defined as follows. Suppose we are given a text of length n, a pattern of length m and an integer k. Fi...

An Integrated Technique for Production Data Analysis (PDA) With Application to Mature Fields

The most common data that engineers can count on, especially in mature fields, is production rate data. Practical methods for production data analysis (PDA) have come a long way since their introduction several decades ago and fall into two categories: decline curve analysis (DCA) and type curve matching (TCM). DCA is independent of any reservoir characteristics, and TCM is a subjective procedu...

Approximate Pattern Matching Over the Burrows-Wheeler Transformed Text

The compressed pattern matching problem is to locate the occurrence(s) of a pattern P in a text string T using a compressed representation of T, with minimal (or no) decompression. In this paper, we consider approximate pattern matching directly on Burrows-Wheeler transformed (BWT) text, which is a critical step for a fully compressed pattern matching algorithm on a BWT based compression algorit...

Publication date: 2008